Revisiting beta‐2 microglobulin as a prognostic marker in diffuse large B‐cell lymphoma

Abstract
Background: Several clinical prognostic models for diffuse large B‐cell lymphoma (DLBCL) have been proposed, including the most commonly used International Prognostic Index (IPI), the National Comprehensive Cancer Network IPI (NCCN‐IPI), and models incorporating beta‐2 microglobulin (β2M). However, the role of β2M in DLBCL patients is not fully understood.
Methods: We identified 6075 patients with newly diagnosed DLBCL treated with immunochemotherapy registered in the Danish Lymphoma Registry.
Results: A total of 3232 patients had data available to calculate risk scores from each of the nine considered risk models for DLBCL, including a model developed from our population. Three of four models with β2M and NCCN‐IPI performed better than the International Prognostic Indices (IPI, age‐adjusted IPI, and revised IPI). Five‐year overall survival (OS) for high‐ and low‐risk patients was 43.6% and 86.4% for IPI and 34.9% and 96.2% for NCCN‐IPI. In univariate analysis, higher levels of β2M were associated with inferior survival, higher tumor burden (advanced clinical stage and bulky disease), previous malignancy, and increased age and creatinine levels. Furthermore, we developed a model (β2M‐NCCN‐IPI) by adding β2M to NCCN‐IPI (c‐index 0.708) with improved discriminatory ability compared to NCCN‐IPI (c‐index 0.698, p < 0.05) and 5‐year OS of 33.1%, 56.2%, 82.4%, and 96.4% in the high, high‐intermediate, low‐intermediate, and low‐risk groups, respectively.
Conclusion: International Prognostic Indices, except for NCCN‐IPI, fail to accurately discriminate risk groups in the rituximab era. β2M, a readily available marker, could improve the discriminatory performance of NCCN‐IPI and should be re‐evaluated in the development setting of future models for DLBCL.


| INTRODUCTION
Diffuse large B-cell lymphoma (DLBCL) is the most frequent type of non-Hodgkin lymphoma, accounting for approximately 30% of all lymphoid malignancies. 1 DLBCL is characterized by significant heterogeneity in survival, and 30%-40% of patients are primary refractory or relapse following standard therapy. 2 Accurate estimation of outcomes after standard therapy is of interest to tailor therapies and experimental approaches to patients' risk profiles.
The International Prognostic Index (IPI) was developed in 1993 as a prognostic tool for patients with aggressive lymphoma treated with doxorubicin. 3 Patients were stratified into four risk groups based on five simple variables: age, Ann Arbor stage, Eastern Cooperative Oncology Group performance status (ECOG PS), number of extranodal sites, and lactate dehydrogenase (LDH). With the introduction of rituximab, several studies indicated suboptimal predictive power of the IPI in rituximab-treated patients. 4 The National Comprehensive Cancer Network IPI (NCCN-IPI), proposed in 2014, is one of the most prominent models developed for DLBCL patients in the rituximab era. 5 This model is based on the same five variables as the IPI but uses four strata for age, three strata for LDH levels, and a refined definition of high-risk localizations. 7 However, NCCN-IPI does not include markers that reflect the molecular-genetic or other fingerprints of the tumor or the associated microenvironment, even though many new prognostic factors have been associated with survival, including clinical, radiologic, biological, cytogenetic, and molecular markers. 5 Including more sophisticated biomarkers in models for routine daily practice is nevertheless difficult, as they are time-consuming to measure and associated with added costs. 8 Many readily available biomarkers have been associated with survival in lymphoma patients, including beta-2 microglobulin (β2M), a small polypeptide light chain that forms part of the major histocompatibility complex (MHC) class I antigens. 8

β2M is encoded by the B2M gene, which can be altered by different mechanisms, leading to impaired β2M protein expression. Additionally, B2M gene alterations can accumulate during cancer progression, contributing to poor response to cancer immunotherapies by dampening antigen presentation. In lymphoid malignancies, elevated serum β2M levels have been found in patients with significant tumor burden because β2M is ubiquitously expressed in most nucleated tumor cells, and white-blood cell membrane turnover is the primary source of serum β2M. 8 β2M has been related to inferior survival in patients with multiple myeloma and is part of the International Scoring System (ISS) and Revised ISS (R-ISS). 9,10,13 The prognostic significance of β2M has also been investigated in patients with DLBCL treated with and without rituximab. 14,15 As early as 2000, Conconi et al. tried to improve the prognostic value of the IPI and developed a prognostic model combining β2M and IPI variables (β2M-IPI) in DLBCL patients treated with CHOP (cyclophosphamide, doxorubicin, vincristine, and prednisone)-based regimens. 15,18 The most extensive study to date that developed and validated a model with β2M came from a Spanish group (Grupo Español de Linfomas y Trasplantes de Médula Ósea, GELTAMO), which developed GELTAMO-IPI in a population of 1848 patients treated with rituximab-based regimens. 8,19 Attempts to combine β2M with clinical, laboratory, immunohistochemical, and molecular markers have been made, but validation of these models is lacking. 8,20,21

We conducted a population registry-based study to evaluate the prognostic role of β2M in, so far, the largest number of DLBCL patients. We aimed to (1)

| Patients
Patients analyzed in this study were identified through the nationwide Danish lymphoma registry (LYFO) as part of the Danish validation project. 22 Patients newly diagnosed with DLBCL aged 18 years and older, diagnosed between January 2000 and June 2021, and treated with at least one cycle of rituximab plus CHOP (R-CHOP)/R-CHOP-like therapy were screened for inclusion. Only patients with all clinical and laboratory variables required by the analyzed prognostic models were included in the final analysis. Patients with testicular involvement, systemic disease, and concomitant CNS involvement were included in the study. Patients were excluded if they were diagnosed with primary central nervous system (CNS) lymphoma.
Prognostic models of interest were identified through a literature search and a previous systematic review. 4,23

| Statistical analysis
The primary survival outcome was OS measured from diagnosis until death from any cause or censoring at the last follow-up. Progression-free survival (PFS) was calculated from diagnosis until relapse/disease progression, death, or censoring at the last follow-up. Survival was estimated using the Kaplan-Meier estimator, and differences between survival curves were tested using the log-rank method. Cox proportional hazard models estimated associations between individual variables and OS and PFS. Logistic regression was used to measure the effects of explanatory variables on β2M, which was dichotomized according to the upper limit of normal (≤ULN vs. >ULN).
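As an illustration of the Kaplan-Meier estimator named above, the following minimal Python sketch computes the product-limit survival curve from follow-up times and event indicators. It is not the code used in the study (the analyses were run in SPSS and R); the function name `kaplan_meier` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient (any consistent unit)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # multiply by the conditional probability of surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: six patients, follow-up in months (not study data).
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at observed death times.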
Integrated Brier score (IBS) was used to measure overall performance, with 0 indicating that prediction and outcome are equal and 1 indicating discordant prediction. 24 Akaike information criterion (AIC) and Bayesian information criterion (BIC) were used as measures of model fit, with lower values indicating better fit. 25 Moreover, the area under the receiver operating characteristic curve (AUC) was used to assess the discrimination of the predictive models. 26,28,29 Calibration (agreement between predicted and actual probabilities estimated by a predictive model) was presented with calibration curves. Models whose curves lie close to the 45-degree line are well calibrated, with the line itself representing perfect calibration. 30 Interrater-weighted κ statistics and 95% confidence intervals (CI) were used to compare agreement between the NCCN-IPI and other four-risk models. 31 We used the t-test, Mann-Whitney test, or chi-square test, as appropriate, to test for differences between groups.
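The weighted κ statistic used to compare risk-group assignments can be sketched as follows. This is a minimal illustration, assuming linear disagreement weights; the weighting scheme actually used in the study is not stated in the text, and the function name `weighted_kappa` and the example assignments are hypothetical.

```python
def weighted_kappa(groups_a, groups_b, n_categories):
    """Cohen's weighted kappa with linear disagreement weights (an assumption).

    groups_a, groups_b : risk group per patient, coded 0 .. n_categories - 1
    Returns 1.0 for perfect agreement and 0.0 for chance-level agreement.
    """
    n = len(groups_a)
    k = n_categories
    # observed joint proportions of (model A group, model B group)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(groups_a, groups_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical four-level risk groups (0 = low ... 3 = high) for six patients.
model_a = [0, 1, 1, 2, 2, 3]
model_b = [0, 1, 2, 2, 3, 3]
kappa = weighted_kappa(model_a, model_b, 4)
print(f"weighted kappa = {kappa:.3f}")
```

Because disagreements are weighted by how many risk levels apart the two assignments are, a patient moved from low to low-intermediate risk penalizes agreement less than one moved from low to high risk.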
To develop a new model incorporating β2M, the population was randomly split into a training cohort comprising two-thirds of the analyzed population and a validation cohort comprising the remaining one-third. The cutoff for β2M, according to ULN, was chosen as previously reported by the GELTAMO group, which developed its β2M-based model from the largest patient population published so far. NCCN-IPI variables were combined with β2M (≤ULN vs. >ULN), resulting in a risk score (β2M-NCCN-IPI) ranging from 0 to 9 points. Patients were regrouped into low (0-1 points), low-intermediate (2-4), high-intermediate (5-6), and high (>6 points) risk groups (Table S1). Several regrouping combinations were tested in the training cohort, and the one producing the highest c-index was selected. Only patients with all available variables of interest required by the analyzed models were included, as multiple imputations would not reasonably approximate the true distributional relation between unobserved data and available information. 32 All p-values were two-sided, and p < 0.05 was considered statistically significant. All analyses were performed in IBM SPSS Statistics 22 (IBM Corporation, Armonk, NY, USA), which was also used for randomization when developing the model, and in R 4.0.0 software (The R Foundation for Statistical Computing, Vienna, Austria) using the following packages: CPE, ggplot2, ggsurvfit, dynpred, maxstat, pec, rms, survC1, and survival.
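The regrouping rule described above can be written as a short sketch. It assumes that an elevated β2M (>ULN) adds one point to the 0-8 point NCCN-IPI score, which is consistent with the stated 0-9 range but is an assumption here, since the exact point assignment is given in Table S1. The function name and the ULN value used in the example are hypothetical.

```python
def beta2m_nccn_ipi(nccn_ipi_points, beta2m, uln):
    """Return (score, risk group) for the combined index.

    nccn_ipi_points : NCCN-IPI score, 0-8
    beta2m, uln     : serum beta-2 microglobulin and its upper limit of normal,
                      in the same unit
    Assumption: beta2m > uln contributes exactly one extra point.
    """
    if not 0 <= nccn_ipi_points <= 8:
        raise ValueError("NCCN-IPI score must be between 0 and 8")
    score = nccn_ipi_points + (1 if beta2m > uln else 0)
    if score <= 1:
        group = "low"
    elif score <= 4:
        group = "low-intermediate"
    elif score <= 6:
        group = "high-intermediate"
    else:  # more than 6 points
        group = "high"
    return score, group


# Hypothetical patient: NCCN-IPI of 6 with beta2m above an illustrative ULN.
example = beta2m_nccn_ipi(6, beta2m=3.4, uln=2.3)
print(example)
```

Under this rule, a patient with six NCCN-IPI points stays in the high-intermediate group when β2M is normal and moves to the high-risk group when it is elevated.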

| RESULTS
Of 6075 patients with DLBCL registered in LYFO who received at least one cycle of R-CHOP/R-CHOP-like therapy in the inclusion period, only 382 patients lacked IPI/NCCN-IPI variables, while one lacked data on vital status. Among 5692 potential candidates, data to calculate required prognostic indices of interest were available for 3232, while 2460 patients were excluded due to missing data on β2M (Figure 1).
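The patient flow reported above can be checked with simple arithmetic. The sketch below uses only counts stated in the text (the training and validation cohort sizes are taken from the Model development section); the variable names are illustrative.

```python
registered = 6075            # received at least one cycle of R-CHOP/R-CHOP-like therapy
missing_ipi_variables = 382  # lacked IPI/NCCN-IPI variables
missing_vital_status = 1     # lacked data on vital status

candidates = registered - missing_ipi_variables - missing_vital_status
analyzed = 3232              # had all variables, including beta-2 microglobulin
excluded = candidates - analyzed
excluded_pct = round(100 * excluded / candidates, 1)

training, validation = 2155, 1077  # two-thirds / one-third split

print(candidates, excluded, excluded_pct)
print(training + validation == analyzed)
```

The counts are internally consistent: 5692 candidates, 2460 of them (43.2%) excluded for missing β2M, and the two cohorts together make up the 3232 analyzed patients.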
The median age of the analyzed population was 68 years (range 18-95), with 71.3% older than 60 years. Table 1 summarizes the baseline patient characteristics of 3232 patients included in the final analysis. Additionally, clinical characteristics of the excluded patients with available IPI/NCCN-IPI parameters are provided as comparisons. Of note, among 2460 excluded patients (43.2%), there was no difference regarding age and gender compared to patients included in the final analysis. However, differences were observed regarding other analyzed variables, as provided in Table 1.

| Prognostic models
Table S1 summarizes the variables included in each model along with distributions of patients according to risk categories in original models for a more straightforward overview. Table 2 provides distributions of patients from our cohort concerning risk groups and 3- and 5-year PFS and OS according to each model.

| Variables included in models and correlations with β2M
The disease stage was the only IPI parameter used in all models, while all models except aaIPI included age. Moreover, gradations into several age groups were used in NCCN-IPI and GELTAMO-IPI. ECOG PS and LDH were used in seven models, with the gradation of LDH used only in NCCN-IPI. Extranodal localizations were combined with other parameters in five models.
β2M was part of four previously reported models, with two models dichotomizing β2M according to the ULN and two according to an optimal cutoff (Table S1).
Of the analyzed parameters, age, stage, ECOG PS, extranodal localizations, LDH, previous malignancy, bulky disease, and creatinine levels were associated with β2M levels; only gender was not. When gender was excluded from further analysis, multivariate logistic regression showed that all remaining parameters influenced levels of β2M (Table S2).
As presented in Table S3, we analyzed the agreement between the NCCN-IPI as the reference model and other models stratifying patients into four risk groups. Based on the weighted κ analysis, substantial agreement (weighted κ between 0.61 and 0.80) was only observed between NCCN-IPI and the IPI (0.630) and GELTAMO-IPI (0.612).

| Overall survival (OS)
The median follow-up of the study population was 59.5 months (range 0.30-228.6 months). There were 1283 Parameters included in each model were prognostically significant in univariate analysis (Table 3). However, when β2M was combined with individual IPI variables, EN localizations lost prognostic significance. In contrast, multivariate analysis with β2M and NCCN-IPI variables showed that all parameters retained prognostic significance.
Figure 2 presents Kaplan-Meier and calibration curves for International Prognostic Indices, while Figure 3 presents the same for models with β2M. Table 3 summarizes calculated 3- and 5-year OS rates for all models. Although most models did not identify patients with a high risk of

TABLE 2 Distribution of patients within risk groups of analyzed prognostic models for diffuse large B-cell lymphoma patients concerning 3- and 5-year overall and progression-free survival.

| Progression-free survival (PFS)
The median PFS was 139.8 months (95% CI, 128.3-151.3 months). Table S4 provides hazard ratios (HRs) for PFS for risk groups within each prognostic model. Kaplan-Meier and calibration curves were similar to those of OS (data not provided).

Multivariate analysis with NCCN-IPI variables and β2M
Similar results were obtained for PFS, in terms of overall performance measures, fitness, and discrimination, as for OS. Of note, all four models with β2M had a higher c-index than IPI, aaIPI, and R-IPI but did not outperform NCCN-IPI (Table S5).

| Model development
As provided in Table S6, the new model (β2M-NCCN-IPI) showed improved performance measures compared to NCCN-IPI and IPI in both the training (2155 patients) and validation cohorts (1077 patients). When all 3232 patients were analyzed together, they were categorized into low-risk (8.0%), low-intermediate (48.1%), high-intermediate (32.7%), and high-risk (11.2%) groups (Table 2). Although β2M-NCCN-IPI had almost perfect risk group agreement (weighted κ = 0.873) with NCCN-IPI, this model provided a statistically significant improvement of the c-index regarding PFS (0.700) and OS (0.708) compared to the NCCN-IPI (Table 4; Table S4). The new model showed a larger difference in survival rates between low- and high-risk groups than NCCN-IPI. The 5-year overall survival was 33.1% in the high-risk group and 96.4% in the low-risk group. Although the new model could not accurately identify patients with very poor survival, it provided improved estimates compared to NCCN-IPI (Table 2; Figure 4). Kaplan-Meier curves and calibration curves of the new model (β2M-NCCN-IPI) are presented in Figure 4.
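To make the c-index comparisons above more concrete, the following sketch computes Harrell's concordance index for right-censored data. It illustrates the general idea only and is not necessarily the estimator used in the study (the R packages listed in the Methods provide several variants); the function name and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i died and did so strictly
    before patient j's last observed time. The pair is concordant when the
    patient who died earlier has the higher risk score; ties in the risk
    score count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death, and a risk score
# where higher values mean higher predicted risk (not study data).
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [3, 2, 2, 1, 0]
c_index = harrell_c_index(toy_times, toy_events, toy_scores)
print(f"c-index = {c_index:.4f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, so the reported difference between 0.698 and 0.708 reflects a small gain in the proportion of correctly ordered patient pairs.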

| DISCUSSION
In this extensive population-based analysis of prognostic models, we confirmed the prognostic value of International Prognostic Indices (IPI, aaIPI, R-IPI, and NCCN-IPI) and four models that include β2M (β2-IPI, GELTAMO-IPI, Modified Prognostic Model, and New Prognostic Index). However, IPI, aaIPI, and R-IPI showed statistically inferior performance measures compared to other models. Regarding PFS and OS, the discriminatory ability of NCCN-IPI was numerically higher than that of models with β2M, but NCCN-IPI did not statistically outperform these models. 5,19,33,34 However, adding β2M to NCCN-IPI improved the NCCN-IPI's discriminatory ability. β2M is a readily available, simple, and inexpensive laboratory biomarker that has shown prognostic significance in diverse lymphoproliferative disorders. 14 Moreover, β2M can be increased in patients with systemic or local inflammation and those with renal failure, as it is mainly excreted by the kidneys. 8,10,11 β2M is an important unit of MHC class I and is essential for the proper functioning of the MHC class I heavy chain, enhancing the ability to bind peptides. 35 In cancer patients, B2M gene alterations have been related to MHC class I deficiency and loss of β2M protein expression, facilitating tumor cell escape from the host's immune control. This avoidance mechanism, tumor evasion, was recognized as one of the critical processes of tumor resistance to the cytolytic activity of T cells. 36 Additionally, β2M deficiency has been connected to immune escape in melanoma and non-small-cell lung cancer patients and related to unfavorable prognosis. 36 Genetic alterations associated with the inactivation of the B2M gene were reported in 29% of DLBCL cases. 37 Several studies have previously demonstrated the prognostic significance of β2M in DLBCL patients treated with and without rituximab. 1,8,14,15,19,38

In the original study behind the IPI, β2M was shown to be an independent prognostic marker, but due to many patients with missing values, this parameter was not incorporated in the final model. 3 In an attempt to investigate whether the addition of β2M increases the prognostic value of the IPI, Conconi et al. proposed β2-IPI. 15 Due to missing values, this model was developed in only 71 patients treated with CHOP therapy. 15 Montalbán et al. added β2M to the main variables of IPI and improved risk assessment in DLBCL in a study that proposed GELTAMO-IPI. 19 Although all parameters included in the IPI were significant in our univariate analysis, extranodal sites lost their prognostic significance when β2M was added in multivariate analysis. Interestingly, GELTAMO-IPI used β2M instead of extranodal sites, with improved discriminatory ability in the development (1230 patients) and validation settings (618 patients). 19,40,41 Whereas β2-IPI and GELTAMO-IPI dichotomized β2M according to the ULN, the other two models dichotomized β2M according to an optimal cutoff evaluated by AUC. Kanemasa et al. proposed a four-level model based on age, stage, ECOG PS, and a β2M cutoff of 3.2 mg/L, while Kang et al. combined the same parameters with the addition of LDH and a different β2M cutoff of 2.5 mg/L. 16,17 Both models were developed from retrospective studies with a limited number of patients (274 and 621, respectively). 16,17

Although all four-level models could stratify patients into four risk groups, only three models could identify populations with poor outcomes and 5-year survival of 35% or less in high-risk groups, including GELTAMO-IPI (29.5%), New Prognostic Model (30.0%), and NCCN-IPI (34.9%). In comparison, the 5-year OS for IPI was 42.6%. Regarding patients in the low-risk group, NCCN-IPI could identify most patients with excellent prognoses with a 5-year OS of 96.2%, followed by Modified Prognostic Model (96.0%), GELTAMO-IPI (95.4%), and New Prognostic Model (95.3%). R-IPI showed good stratification ability of low-risk patients with 5-year survival comparable to NCCN-IPI.
One of the most extensive validation studies involving 2124 patients from seven clinical trials revealed that all International Prognostic Indices had lower prognostic ability than initially reported. 42 However, in our recent study comparing 13 models developed for DLBCL, including models combining IPI variables with different laboratory parameters, NCCN-IPI consistently showed superior performance compared with other analyzed models. 7 The study included 5126 patients with DLBCL, resulting in similar 5-year survival rates to the original NCCN-IPI. This may be due to both cohorts being representative of real-life populations. 7 As β2M data were frequently missing, comparisons regarding model performance with models including β2M were not conducted. In the current analysis comparing performance measures among International Prognostic Indices and models incorporating β2M, we found superior discriminatory ability evaluated by the c-index of NCCN-IPI (0.698), GELTAMO-IPI (0.697), and New Prognostic Index (0.693) compared to IPI (0.677), aaIPI (0.18), and R-IPI (0.646). However, NCCN-IPI did not statistically outperform models with β2M, nor did GELTAMO-IPI show improved discriminatory ability compared to NCCN-IPI. When other measures of model performance were calculated, including calibration concerning PFS and OS, NCCN-IPI and β2M-based models again provided better performance than other International Prognostic Indices.
Coutinho et al. conducted a validation study on 386 patients uniformly treated with R-CHOP, concluding that NCCN-IPI was better than IPI and GELTAMO-IPI in identifying patients with a poor prognosis. 40 The study found that when controlling for NCCN-IPI risk groups, bulky disease and elevated β2M were independent predictors of poor prognosis. However, adding them to NCCN-IPI did not improve the identification of patients with poor outcomes. 40 To analyze whether β2M improves the prognostic significance of NCCN-IPI in our cohort, we tested a model by adding β2M to NCCN-IPI and regrouping patients into four risk groups. The new model, named β2M-NCCN-IPI, provided better performance measures than other models, with 5-year survival in high- and low-risk groups of 33.1% and 96.4%, respectively. The improvement of NCCN-IPI by adding β2M reflects better regrouping, particularly in low-intermediate and high-intermediate groups, with a 5-year survival of 56.2% and 82.4% in high-intermediate and low-intermediate groups, respectively, compared to 61.7% and 83.1% in the respective NCCN-IPI groups. Moreover, β2M could add prognostic power to NCCN-IPI due to the association of β2M with prognostically adverse markers such as increased age, advanced stage, extranodal involvement, increased LDH, bulky disease, secondary malignancy, and impaired kidney function. We confirm previous findings that β2M may reflect tumor burden, although other conditions could be contributing factors and should be cautiously evaluated. 21

The current study's main limitations are its retrospective nature and the large number of patients excluded due to missing data on β2M, which could impact the results. Excluded patients were more often diagnosed in earlier years of the study selection period. Moreover, they tended to have more adverse IPI/NCCN-IPI factors, although no differences regarding age and gender were observed. This finding may reflect a failure to complete all diagnostic and prognostic assessments, including β2M, in patients presenting with high tumor volume, rapidly progressing disease, and negatively affected performance status in urgent need of therapy. As we lacked data on cell of origin and chromosomal translocations (MYC and BCL2 and/or BCL6), we were unable to compare models integrating biologic prognostic markers reflecting recent advances in DLBCL's genomics, molecular biology, immunology, and radiology, further limiting comparisons with other potentially relevant models. 8,20,21 One limitation of the registry used is the lack of precise data on other malignancies, which hinders the thorough investigation of their effects on β2M elevation. However, the study aimed to compare four models integrating β2M with International Prognostic Indices, validate these models in the most extensive population analyzing β2M in DLBCL, and finally develop a model showing that β2M could further increase the discriminatory ability of existing models. Moreover, we compared patients with and without available β2M. By this, we confirm a potential selection bias when many patients are excluded for different reasons despite a large, selected population.

| CONCLUSIONS
In this large retrospective nationwide register-based study, we found superior model quality and discriminatory ability of NCCN-IPI and four models incorporating β2M compared to IPI, aaIPI, and R-IPI. The four models incorporating β2M did not outperform NCCN-IPI in terms of performance measures. However, adding β2M to NCCN-IPI could give the newly proposed model (β2M-NCCN-IPI) better discriminatory ability than other models. Therefore, we suggest that β2M should be considered in future studies aiming to develop models for DLBCL, as it is related to factors of prognostic importance for survival in lymphoma patients reflecting greater tumor burden (advanced clinical stage, extranodal sites, bulky disease, and serum LDH level) in addition to increased age and creatinine levels.
investigate the association of β2M with other clinically relevant prognostic factors in DLBCL registered in our national database, (2) compare four β2M-based models with four International Prognostic Indices (IPI, age-adjusted IPI [aaIPI], revised IPI [R-IPI], NCCN-IPI), and (3) investigate whether the addition of β2M to NCCN-IPI would improve the discriminatory performance of NCCN-IPI by developing a new model based on NCCN-IPI variables and β2M.

FIGURE 2 Kaplan-Meier and calibration curves of four International Prognostic Indices in diffuse large B-cell lymphoma patients with respect to overall survival. The shaded color areas around curves represent confidence intervals.

FIGURE 3 Kaplan-Meier and calibration curves of four models with β2M in diffuse large B-cell lymphoma patients with respect to progression-free survival. The shaded color areas around curves represent confidence intervals.

TABLE 4 Summary of hazard ratios, overall performance, fit/quality, and discrimination measures concerning overall survival.

FIGURE 4 Kaplan-Meier and calibration curves of new β2M-NCCN-IPI with respect to overall and progression-free survival. The shaded color areas around curves represent confidence intervals.
of the selection process for identifying patients eligible for the current study.